Adaptive Hausdorff distances and dynamic clustering of symbolic interval data

Authors

Francisco de A. T. de Carvalho

Renata M. C. R. de Souza

Marie Chavent

Yves Lechevallier

Abstract

This paper presents a partitional dynamic clustering method for interval data based on adaptive Hausdorff distances. Dynamic clustering algorithms are iterative two-step relocation algorithms: at each iteration they construct the clusters and identify a suitable representation or prototype (means, axes, probability laws, groups of elements, etc.) for each cluster by locally optimizing an adequacy criterion that measures the fit between the clusters and their corresponding representatives. In this paper, each pattern is represented by a vector of intervals, and adaptive Hausdorff distances are used to compare two interval vectors. At each iteration, the adaptive distance of each cluster changes according to the cluster's intra-class structure. The advantage of these adaptive distances is that the clustering algorithm is able to recognize clusters of different shapes and sizes. To evaluate this method, experiments with real and synthetic interval data sets were performed. The evaluation is based on an external cluster validity index (the corrected Rand index) in the framework of a Monte Carlo experiment with 100 replications. These experiments showed the usefulness of the proposed method. © 2005 Elsevier B.V. All rights reserved.
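The two-step relocation scheme described in the abstract can be illustrated with a minimal Python sketch. The abstract gives no formulas, so the details here are assumptions rather than the authors' exact method: the distance between two interval vectors is taken as the sum over variables of the one-dimensional Hausdorff distance max(|a - c|, |b - d|); each prototype interval is built from the median midpoint and median half-length of the cluster members; and the per-cluster weights are chosen so that their product is 1, giving more weight to variables with low within-cluster dispersion. All function names are hypothetical.

```python
import math
from statistics import median

def hausdorff(u, v):
    """Hausdorff distance between two closed intervals u=(a, b) and v=(c, d)."""
    return max(abs(u[0] - v[0]), abs(u[1] - v[1]))

def adaptive_distance(x, proto, weights):
    """Weighted sum of per-variable Hausdorff distances (assumed form)."""
    return sum(w * hausdorff(xi, gi) for xi, gi, w in zip(x, proto, weights))

def prototype(members):
    """Assumed prototype: per variable, median midpoint and median half-length."""
    proto = []
    for j in range(len(members[0])):
        mid = median((x[j][0] + x[j][1]) / 2 for x in members)
        rad = median((x[j][1] - x[j][0]) / 2 for x in members)
        proto.append((mid - rad, mid + rad))
    return proto

def cluster_weights(members, proto, eps=1e-9):
    """Assumed weights: geometric mean of the per-variable dispersions divided
    by each dispersion, so that the product of the weights equals 1."""
    disp = [max(sum(hausdorff(x[j], proto[j]) for x in members), eps)
            for j in range(len(proto))]
    geo = math.prod(disp) ** (1 / len(disp))
    return [geo / d for d in disp]

def dynamic_clustering(data, init_protos, n_iter=50):
    """Alternate between allocation and representation until labels stop changing."""
    k = len(init_protos)
    protos = [list(g) for g in init_protos]
    lambdas = [[1.0] * len(data[0]) for _ in range(k)]
    labels = None
    for _ in range(n_iter):
        # Allocation step: assign each pattern to its closest prototype.
        new_labels = [
            min(range(k), key=lambda c: adaptive_distance(x, protos[c], lambdas[c]))
            for x in data
        ]
        if new_labels == labels:
            break
        labels = new_labels
        # Representation step: update prototype and weights of each cluster.
        for c in range(k):
            members = [x for x, lab in zip(data, labels) if lab == c]
            if members:  # an empty cluster keeps its previous prototype
                protos[c] = prototype(members)
                lambdas[c] = cluster_weights(members, protos[c])
    return labels, protos, lambdas

# Toy data: six patterns, each a vector of two intervals.
data = [
    [(0.0, 1.0), (0.0, 1.0)], [(0.1, 1.2), (0.2, 0.9)], [(0.2, 1.1), (0.1, 1.3)],
    [(10.0, 11.0), (10.0, 12.0)], [(10.5, 11.5), (10.2, 11.9)], [(9.8, 11.2), (10.1, 12.2)],
]
labels, protos, lambdas = dynamic_clustering(data, [data[0], data[3]])
print(labels)
```

The initial prototypes are passed in by the caller to keep the sketch deterministic; in practice they would typically be drawn at random from the data, and the run repeated from several starting points.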

Similar resources

Adaptative Hausdorff Distances and Dynamic Clustering of Symbolic Interval Data

Hausdorff Distance Measure Based Interval Fuzzy Possibilistic C-Means Clustering Algorithm

Clustering algorithms have been widely used in artificial intelligence, data mining, machine learning, etc. Clustering is unsupervised classification that divides a data set into groups: similar data are placed in the same group, and dissimilar data in different groups. In general, analyzing interval data needs Type II fuzzy log...

Clustering Interval-valued Data Using an Overlapped Interval Divergence

As a common problem in data clustering applications, how to identify a suitable proximity measure between data instances is still an open problem. Especially when interval-valued data is becoming more and more popular, it is expected to have a suitable distance for intervals. Existing distance measures only consider the lower and upper bounds of intervals, but overlook the overlapped area betwe...

Fuzzy c-means clustering methods for symbolic interval data

This paper presents adaptive and non-adaptive fuzzy c-means clustering methods for partitioning symbolic interval data. The proposed methods furnish a fuzzy partition and prototype for each cluster by optimizing an adequacy criterion based on suitable squared Euclidean distances between vectors of intervals. Moreover, various cluster interpretation tools are introduced. Experiments with real an...

Multidimensional Interval-Data: Metrics and Factorial Analysis

Statistical units described by interval-valued variables represent a special case of Symbolic Objects, where all descriptors are quantitative variables. In this context, the paper presents two different metrics in R^p for interval-valued data that are based on the definition of the Hausdorff distance in R. Hausdorff distance in R^p (for any p ≥ 1) is an L∞ norm between pairs of closed sets. However,...

Journal:

Pattern Recognition Letters

Volume 27

Publication date: 2006
